Stochastic Dropout: Activation-level Dropout to Learn Better Neural Language Models

Author

  • Allen Nie
Abstract

Recurrent Neural Networks are very powerful computational tools that are capable of learning many tasks across different domains. However, they are prone to overfitting and can be very difficult to regularize. Inspired by Recurrent Dropout [1] and Skip-connections [2], we describe a new and simple regularization scheme: Stochastic Dropout. It resembles the structure of recurrent dropout, but offers a skip-connection over the recurrent depth. We reason about the theoretical construction of this method and compare its regularization effectiveness with feedforward dropout and recurrent dropout. We demonstrate that Stochastic Dropout not only offers improvement when applied to vanilla RNN models, but also outperforms feedforward dropout on word-level language modeling. Finally, we show that the model can achieve even better results if stochastic dropout and feedforward dropout are combined.
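The abstract describes the core idea at a high level: with some probability, a hidden unit skips its recurrent update and carries its previous value forward, acting as a stochastic skip-connection over the recurrent depth. The sketch below is a minimal illustration of that idea under our own assumptions (function names and the test-time averaging rule are ours, not taken from the paper):

```python
import numpy as np

def stochastic_dropout_step(h_prev, h_candidate, p, training=True, rng=None):
    """One recurrent step with stochastic dropout (illustrative sketch).

    With probability p, each hidden unit keeps its previous value,
    forming a skip-connection over the recurrent depth; otherwise it
    takes the freshly computed candidate value.
    """
    if not training:
        # At test time, use the expected value of the stochastic update
        # (an assumption here, analogous to standard dropout scaling).
        return p * h_prev + (1.0 - p) * h_candidate
    rng = rng or np.random.default_rng()
    mask = rng.random(h_prev.shape) < p  # True -> keep previous state
    return np.where(mask, h_prev, h_candidate)
```

With `p = 0` this reduces to an ordinary recurrent update, and with `p = 1` the state is frozen; intermediate values randomly mix the two, which is the regularizing effect the abstract attributes to the method.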


Similar Articles

Dropout training for Hidden Unit CRFs

A very commonly faced issue while training prediction models using machine learning is overfitting. Dropout is a recently developed technique designed to counter this issue in deep neural networks and has also been extended to other algorithms like SVMs. In this project, we formulate and study the application of Dropout to Hidden Unit Conditional Random Fields (HUCRFs). HUCRFs use binary stocha...


Analysis on the Dropout Effect in Convolutional Neural Networks

Regularizing neural networks is an important task to reduce overfitting. Dropout [1] has been a widely-used regularization trick for neural networks. In convolutional neural networks (CNNs), dropout is usually applied to the fully connected layers. Meanwhile, the regularization effect of dropout in the convolutional layers has not been thoroughly analyzed in the literature. In this paper, we an...


Designing a Neuro-Sliding Mode Controller for Networked Control Systems with Packet Dropout

This paper addresses control design in networked control system by considering stochastic packet dropouts in the forward path of the control loop. The packet dropouts are modelled by mutually independent stochastic variables satisfying Bernoulli binary distribution. A sliding mode controller is utilized to overcome the adverse influences of stochastic packet dropouts in networked control system...


DropLasso: A robust variant of Lasso for single cell RNA-seq data

Single-cell RNA sequencing (scRNA-seq) is a fast growing approach to measure the genome-wide transcriptome of many individual cells in parallel, but results in noisy data with many dropout events. Existing methods to learn molecular signatures from bulk transcriptomic data may therefore not be adapted to scRNA-seq data, in order to automatically classify individual cells into predefined classes...


Gaussian Process Neurons Learn Stochastic Activation Functions

We propose stochastic, non-parametric activation functions that are fully learnable and individual to each neuron. Complexity and the risk of overfitting are controlled by placing a Gaussian process prior over these functions. The result is the Gaussian process neuron, a probabilistic unit that can be used as the basic building block for probabilistic graphical models that resemble the structur...



Journal:

Volume   Issue 

Pages  -

Publication date: 2016